Database Glossary
A
Ad-hoc reporting
The ability to access information from a database to meet information requirements as required. One-off requests not satisfied by pre-set reporting.
Agent
An application that searches the data and sends an alert when a particular pattern is found.
Aggregations
Information stored in a data warehouse in a summarized form. Instead of recording the date and time each time a certain product is sold, the data warehouse could store the quantity of the product sold each hour, each day, or each week. Aggregations are used for two primary reasons:
- To save storage space. Data warehouses can get large. The use of aggregations greatly reduces the space needed to store data.
- To improve the performance of business intelligence tools. When queries run faster they take up less processing time and the users get their information back more quickly.
Alert
A message that is sent automatically by a computer system when a certain situation occurs.
Application
A piece of software designed to meet a specific purpose.
Attribute
A single data item related to a database object. The database schema associates one or more attributes with each database entity.
Example: In the following database table, the attributes are Name, ID, Extension
|
Name |
ID |
Extension |
| Jim | 124 | 7075 |
| Valeri | 128 | 0853 |
| Bob | 192 | 4214 |
B
Business intelligence tools
Software that enables business users to see and use large amounts of complex data.
C
Candidate key
A combination of attributes that can be uniquely used to identify a database record. Each table may have one or more candidate keys. One of these candidate keys is selected as the table primary key.
|
Name |
Age |
SSN |
Phone Extension |
| Rob | 28 | 123-45-6789 | 1242 |
| Amy | 34 | 987-65-4321 | 9281 |
| Elizabeth | 34 | 111-22-3333 | 9312 |
| Jim | 42 | 333-22-1111 | 3214 |
| Mike | 29 | 999-99-9999 | 2314 |
Cardinality
In set theory, cardinality refers to the number of members in the set. When specifically applied to database theory, the cardinality of a table refers to the number of rows (or tuples) contained in a table.
Chief Information Officer (CIO)
The senior executive in a company responsible for information management and for delivering IT services.
Clickstream data
Data regarding web browsing.
Client/Server architecture
A type of network in which computer processing is distributed among many individual PCs (clients) and a more powerful, central computer (server). Clients can share files and retrieve data stored on the server.
Collaborative software
Groupware, such as Lotus Notes or Microsoft Exchange.
D
Data
A series of facts or statements that may have been collected, stored, processed and/or manipulated but have not been organized or placed into context. When data is organized, it becomes information. Information can be processed and used to draw generalized conclusions or knowledge.
Database
A collection of tables. It also often includes forms for entering data, rules for checking and validating data that has been entered, and the format for creating informative reports from the data in the database.
Data cleansing
The process of manipulating stored data to make it more aligned. Implies eliminating duplication, correcting spelling and removing null fields.
Data-based knowledge
Knowledge derived from data through the use of Business Intelligence Tools and the process of Data Warehousing.
Data mart
A database that has the same characteristics as a data warehouse, but is usually smaller and is focused on the data for one division or one workgroup within an enterprise.
Data migration
The process of physically transmitting data from one environment to another.
Data mining
The use of automated data analysis techniques to uncover previously undetected relationships among data items. Data mining often involves the analysis of data stored in a data warehouse. Three of the major data mining techniques are regression, classification and clustering.
Data model
The description of the contents of a database. It includes tables and the relationship between them.
Data quality
The suitability of data for different requirements. When constructing and populating a database the data quality must be checked to ensure that it meets the needs of all the applications.
Data scrubbing
Removing errors and inconsistencies from data being imported into a data warehouse.
Data transformation
The modification of data as it is moved into the data warehouse.
Data warehouse
A data warehouse is a centralized database that captures information from various parts of an organization's business processes. This information can later be analyzed to determine predictive relationships through the use of data mining techniques.
Data warehousing management
The on-going supervision of the data warehousing process.
Database Management System (DBMS)
The software that is used to store, access, and manage data.
Decision support system (DSS)
A computer system designed to assist an organization in making decisions.
Dimension
The separation of data according to different view. A logical designation of related information that applies a hierarchy for access and reporting.
Domain
The set of all allowable values that attribute may assume.
Drill down and drill up
The ability to move between levels of the hierarchy when viewing data with an OLAP browser.
- Drill down - Changing the view of the data to a greater level of detail.
- Drill up - Changing the view of the data to a higher level of aggregation.
E
Enterprise resource planning
An integrated system of operation applications combining logistics, production, distribution, contract and order management, sales forecasting, and financial and HR management.
Entity
A single object about which data can be stored. It is the "subject" of a table. Entities and their interrelationships are modeled through the use of entity-relationship diagrams.
Electronic Data Interchange (EDI)
Electronic transmission of documents through point-to-point connections using a set of standard forms, messages and data elements; this can be via leased lines, private networks or the Internet.
F
Fact table
In a star schema, the central table which contains the individual facts being stored in the database.
Field
The most basic structural unit of a database. It is a container for a piece of data. In most cases, only a single logical piece of data fits in each field.
Flat file
A data file that contains records with no structured relationships. Additional knowledge is required to interpret these files such as the file format properties. Modern database management systems used a more structured approach to file management (such as one defined by the Structured Query Language) and therefore have more complex storage arrangements.
Form
A database form can be used to facilitate database data entry and/or retrieval operations. A database developer/administrator usually designs a form which can then be used by personnel without any specific database skills to perform repetitive tasks.
Front end
Tools that enable users to interact with underlying application processes or more complex programs through a familiar, easy-to-use interface.
G
Gateway
This is the interface between different computer network, (usually translates from one network protocol to another).
Granularity
The level of detail of the facts stored in a data warehouse.
H
Hardware
The magnetic, mechanical and electrical components of a computer and its peripheral devices.
Hierarchy
Organization of data into a logical tree structure.
I
Index
A database feature used for locating data quickly within a table. Indexes are defined by selecting a set of commonly searched attribute(s) on a table and using the appropriate platform-specific mechanism to create an index.
Information technology (IT)
The hardware and software used to process information.
K
Key
A field that contains a unique identifier for each row in a data table. Even though each individual record represents a separate piece of data, some of those records may look identical. A key provides a completely unambiguous way to distinguish between distinct records, and more importantly, serves as a pointer to a particular record in the table. In many cases, data table keys are constructed by simply adding an additional field to function as the key.
L
Legacy system
A computer system that's been around for a while.
Level
The hierarchies in dimensions have levels which can be used to view data at various levels of detail.
- A Time dimension could have levels for Year, Quarter, Month, and Day.
- A Product dimension could have levels for Product Family, Product Category, Product Subcategory, and Product Name.
- A Customer Geography dimension could have levels for Region, Country, District, State, City and Neighborhood.
Local database warehouse
A database that has the same characteristics as a data warehouse, but is usually smaller and is focused on the data for one division or one workgroup within an enterprise.
M
Mainframe
The central processing unit of a large computer, usually receiving input from a number of terminals.
Member
One of the data points for a level of a hierarchy of a dimension.
Metadata
A term that literally means "data about data." This term refers to information about data itself -- perhaps the origin, size, formatting or other characteristics of a data item. In the database field, metadata is essential to understanding and interpreting the contents of a data warehouse.
Microprocessors
Complex electronic circuits that comprise a computer's central information processing unit.
Multidimensional database management system (MDBMS)
A database management system that organizes data multidimensionally.
N
Non-volatile
Data that does not change.
Normalization
The process of structuring relational database schema such that most ambiguity is removed. The stages of normalization are referred to as normal forms and progress from the least restrictive (First Normal Form) through the most restrictive (Fifth Normal Form). Generally, most database designers do not attempt to implement anything higher than Third Normal.
O
OLAP (on-line analytical processing)
The use of computers to analyze an organization's data.
OLAP browser
A tool used for multidimensional (OLAP) browsing.
OLAP system
Term that is used as a synonym for datawarehousing system.
OLTP (online transaction processing)
The use of computers to run the on-going operation of a business.
P
Program
A set of digitally coded definitions and instructions that enables a computer to perform a particular task.
Protocol
The language that one computer uses to communicate with another.
Q
Query
The primary mechanism for retrieving information from a database and consist of questions presented to the database in a predefined format. Many database management systems use the (Structured Query Language) standard query format.
R
Record or a row
The fields in a record provide a complete description of each item in a collection. A record is a unique instance of data about an object or event.
Relationships
The connections between records in different data tables are provided by relationships.
-
One-to-Many Relationship: The most common relationship between two tables. In this situation, precisely one record in data table A is related to a number of records in data table B. The primary key of table A is inserted as a field into table B where it serves as a foreign key. Relationships between tables are always made through keys.
-
One-to-One Relationship: Each record in one table is linked to one and only one record in another table. In many cases, one of the tables is set of data about a subset of the entities in the main table.
-
Many-to-Many Relationship: Table design rounds out the possible ways of designing data relationships. In some situations, multiple entries in a data table are related to multiple entries in another data table. The classic example is a database of classes and students. Each class consists of many students and each student can take many classes, so in designing a database to track student and class information, a many-to-many design is necessary.
Relational database management system (RDBMS)
A Database Management System is based on relational theory. Most modern Database Management Systems (Oracle, Sybase, Microsoft SQL Server) are relational databases. These databases support a standard language - SQL (Structured Query Language).
Replication
The physical copying of data from one database to another.
S
Scale, scalable, and scalability
Having to do with the ability of a computer system or a database to operate efficiently with larger quantities of data. Scalability is often discussed in situations when multiple processors are joined together. The system scales well (or is scalable) if doubling the number of processors also doubles the speed at which the system performs its tasks. The extra work involved in coordinating larger systems usually prevents them from being fully scalable - so that going from one to two processors would increase the total speed by less than a factor of two.
Schema
The logical organization of data in a database.
Shared dimension
In Microsoft Analysis Services, a dimension used by more than one cube.
Slowly changing dimensions (SCD)
A dimension that has levels or attributes that are changing on an occasional basis.
Software
The programs that are run on a computer system.
Star schema (business definition)
A method of organizing information in a data warehouse that allows the business information to be viewed from many perspectives. The star is a picture of the way the data is being stored. The basic factual information is in the middle of the star. The points of the star represent various perspectives from which the factual information can be viewed.
Structured query language (SQL)
An industry-standard language used for manipulation of data in a relational database. The major SQL commands of interest to database users are Select, Insert, Join and Update.
Summary tables
Tables used to store summarized or aggregated data.
Supercomputer
An extremely powerful computer; designed to deal with large amounts of data at very high speed, often used for military or scientific tasks.
T
Table
The formal name given to the group of records that contain the elements of the collection. A table normally represents a distinct object (business clients or library books), or an event (product orders or stock prices).
Time-variant data
Data that is identified with a particular time period. Time-variant is one of the original defining characteristics of a data warehouse.
X
XML
XML is the eXtensible Markup Language -- a system created to define other markup languages. For this reason, it can also be referred to as a metalanguage. XML is commonly used on the Internet to create simple methods for the exchange of data among diverse clients.

